Add NNDSS Infectious Weekly scripts and schema mappings #1973
abhishekjaisw wants to merge 3 commits into datacommonsorg:master
Code Review
This pull request introduces a new data import pipeline for CDC WONDER NNDSS Infectious Weekly data, including a preprocessing script for MMWR date calculation and a comprehensive property-value mapping file. Feedback focuses on correcting critical data mapping errors in nndss_weekly_pvmap.csv, such as the incorrect use of 52-week maximum values instead of current weekly counts and the misclassification of Hantavirus as Haemophilus. Additionally, corrections are needed for a broken relative path in the manifest and an incorrect data category description in the README.
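For readers unfamiliar with MMWR dates, the calculation the preprocessing script needs can be sketched as follows. This is a minimal illustration, not the code in preprocess.py: the function names `mmwr_week_start` and `mmwr_week_end` are hypothetical, and the sketch assumes the standard CDC definition, in which MMWR weeks run Sunday through Saturday and week 1 is the first week with at least four days in the calendar year.

```python
import datetime


def mmwr_week_start(year: int, week: int) -> datetime.date:
    """Return the Sunday that starts the given MMWR week (hypothetical helper)."""
    jan1 = datetime.date(year, 1, 1)
    # date.weekday() is Monday=0 ... Sunday=6; shift so that Sunday=0.
    days_since_sunday = (jan1.weekday() + 1) % 7
    # Sunday on or before January 1.
    week1_start = jan1 - datetime.timedelta(days=days_since_sunday)
    # If that week has fewer than four days in `year`, it belongs to the
    # previous MMWR year, so week 1 starts on the following Sunday.
    if days_since_sunday > 3:
        week1_start += datetime.timedelta(days=7)
    return week1_start + datetime.timedelta(weeks=week - 1)


def mmwr_week_end(year: int, week: int) -> datetime.date:
    """Return the Saturday that ends the given MMWR week (hypothetical helper)."""
    return mmwr_week_start(year, week) + datetime.timedelta(days=6)
```

Whether the pipeline should key observations on the week-start or week-ending date is a choice for the import itself; the sketch only shows how either date follows from the MMWR year and week number.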
"provenance_description": "Notifiable Infectious Diseases Data: Weekly tables from CDC WONDER which has the incident counts of different infectious diseases per week that are reported by the 50 states, New York City, the District of Columbia, and the U.S. territories.",
"scripts": [
"preprocess.py",
"python ../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/NNDSS_Weekly_Data.csv --pv_map='nndss_weekly_pvmap.csv' --config_file=nndss_weekly_metadata.csv --output_path=output/nndss_weekly_output"
The relative path to stat_var_processor.py appears to be incorrect: based on the directory structure, it should be three levels up (../../../), not two. Additionally, remove the quotes around the --pv_map file path, since file paths should not be quoted unless strictly required.
- "python ../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/NNDSS_Weekly_Data.csv --pv_map='nndss_weekly_pvmap.csv' --config_file=nndss_weekly_metadata.csv --output_path=output/nndss_weekly_output"
+ "python ../../../tools/statvar_importer/stat_var_processor.py --input_data=input_files/NNDSS_Weekly_Data.csv --pv_map=nndss_weekly_pvmap.csv --config_file=nndss_weekly_metadata.csv --output_path=output/nndss_weekly_output"
References
- Do not quote arguments that represent file paths in manifest.json scripts if they are not strictly required for the command to function correctly.
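One way to catch this kind of path error before review is a small check that resolves each script path in a manifest command against the manifest's directory. The sketch below is illustrative only: `missing_script_paths` is a hypothetical helper, not part of the repository's tooling, and it assumes commands are split shell-style and that relative paths are resolved from the directory containing manifest.json.

```python
import shlex
from pathlib import Path


def missing_script_paths(manifest_dir, command):
    """Return the .py tokens in `command` that do not exist relative to manifest_dir.

    Hypothetical helper for illustration; assumes shell-style tokenization.
    """
    missing = []
    # shlex.split applies POSIX quoting rules, so --pv_map='x.csv' and
    # --pv_map=x.csv produce the same token.
    for token in shlex.split(command):
        if token.endswith(".py") and not (Path(manifest_dir) / token).exists():
            missing.append(token)
    return missing
```

Run against the import directory, a command with the wrong number of `../` segments would return the unresolved stat_var_processor.py path, while the corrected command would return an empty list.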
Schema and data refresh CL for CDCWonder_NNDSS_InfectiousWeekly. This CL refreshes the schema and data for the dataset so that the latest information is available for analysis and reporting.